When mining data, it is often desirable to link databases and extract information across them. However, the data are frequently disparate, even when they relate to a single well-defined domain or subject area. Most real-life data drawn from different data sources are not suitable for deterministic record linkage, because high-quality unique identifiers, such as social security numbers, are not available.
Other factors also contribute to the difficulty of linking data, including differing standards, schemas, and formats, as well as errors, inconsistencies, and out-of-date data.
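By way of illustration only, the following minimal sketch shows why exact-match (deterministic) linkage may fail on such data, while a simple similarity-based comparison may still recover the link. The records, field names, the helper functions `normalize_dob`, `deterministic_links`, and `similarity_links`, and the 0.85 threshold are all hypothetical and are not taken from this disclosure; the string comparison uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever similarity measure an actual system might use.

```python
from difflib import SequenceMatcher

# Hypothetical records from two sources; no shared unique identifier exists.
source_a = [
    {"name": "Jonathan Smith", "dob": "1980-03-15"},
    {"name": "Maria Garcia", "dob": "1975-11-02"},
]
source_b = [
    {"name": "Jonathon Smith", "dob": "15/03/1980"},  # typo and different date format
    {"name": "Wei Chen", "dob": "22/07/1990"},
]


def normalize_dob(value):
    """Convert DD/MM/YYYY to YYYY-MM-DD; leave ISO-style dates unchanged."""
    if "/" in value:
        day, month, year = value.split("/")
        return f"{year}-{month}-{day}"
    return value


def deterministic_links(a, b):
    """Link records only when name and date of birth match exactly."""
    return [
        (i, j)
        for i, ra in enumerate(a)
        for j, rb in enumerate(b)
        if ra["name"] == rb["name"] and ra["dob"] == rb["dob"]
    ]


def similarity_links(a, b, threshold=0.85):
    """Link records whose names are similar and whose normalized dates agree."""
    links = []
    for i, ra in enumerate(a):
        for j, rb in enumerate(b):
            name_score = SequenceMatcher(
                None, ra["name"].lower(), rb["name"].lower()
            ).ratio()
            same_dob = normalize_dob(ra["dob"]) == normalize_dob(rb["dob"])
            if name_score >= threshold and same_dob:
                links.append((i, j))
    return links


if __name__ == "__main__":
    print(deterministic_links(source_a, source_b))
    print(similarity_links(source_a, source_b))
```

In this sketch the exact comparison finds no links, because the first pair of records differs by one letter in the name and uses a different date format, whereas the similarity-based comparison links the first record of each source. The threshold value is an arbitrary assumption and would need tuning against real data.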
Where considered appropriate, reference numerals may be repeated among the drawings to indicate corresponding or analogous elements. Moreover, some of the blocks depicted in the drawings may be combined into a single function.